Mapping Large Scale Research Metadata to Linked Data: A Performance Comparison of HBase, CSV and XML

Abstract

OpenAIRE, the Open Access Infrastructure for Research in Europe, comprises a database of all EC FP7 and H2020 funded research projects, including metadata of their results (publications and datasets). These data are stored in an HBase NoSQL database, post-processed, and exposed as HTML for human consumption, and as XML through a web service interface. As an intermediate format to facilitate statistical computations, CSV is generated internally. To interlink the OpenAIRE data with related data on the Web, we aim to export them as Linked Open Data (LOD). The LOD export is required to integrate into the overall data processing workflow, where derived data are regenerated from the base data every day. We thus faced the challenge of identifying the best-performing conversion approach. We evaluated the performance of creating LOD by a MapReduce job on top of HBase, by mapping the intermediate CSV files, and by mapping the XML output.
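
To make the CSV route more concrete, the following is a minimal sketch of mapping CSV rows to N-Triples lines in Python. It is not the authors' implementation: the base IRI, the `fundingProgramme` predicate, the column names (`id`, `title`, `programme`), and the sample rows are all hypothetical, and the sketch assumes that the `id` values are safe to embed in an IRI.

```python
import csv
import io

# Hypothetical namespace and vocabulary, chosen only for illustration.
BASE = "http://example.org/project/"
TITLE = "http://purl.org/dc/terms/title"
FUNDING = "http://example.org/vocab/fundingProgramme"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str) -> list[str]:
    """Map each CSV row to one N-Triples line per non-empty column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: the id column contains only IRI-safe characters.
        subject = f"<{BASE}{row['id']}>"
        for column, predicate in (("title", TITLE), ("programme", FUNDING)):
            value = row.get(column, "")
            if value:
                triples.append(
                    f'{subject} <{predicate}> "{escape_literal(value)}" .'
                )
    return triples


# Hypothetical sample data; the second row has no programme value.
sample = 'id,title,programme\n1,"An ""example"" project",FP7\n2,Another project,\n'
for line in csv_to_ntriples(sample):
    print(line)
```

Because each row is converted independently, a mapping of this shape can be streamed over large files or split across workers, which is one reason a row-oriented format is a natural candidate for a daily regeneration workflow.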
